We consider the problem of optimally decoding a quantum error correction code, that is, finding
the optimal recovery procedure given the outcomes of partial "check" measurements on the system.
In general, this problem is NP-hard. However, we demonstrate that for concatenated block codes, 
the optimal decoding can be efficiently computed using a message passing algorithm. We compare 
the performance of the message passing algorithm to that of the widespread blockwise hard decoding 
technique. Our Monte Carlo results using the 5 qubit and Steane's code on a depolarizing channel 
demonstrate significant advantages of the message passing algorithms in two respects. 1) Optimal 
decoding increases by as much as 94% the error threshold below which the error correction procedure 
can be used to reliably send information over a noisy channel. 2) For noise levels below these 
thresholds, the probability of error after optimal decoding is suppressed at a significantly higher 
rate, leading to a substantial reduction of the error correction overhead. 

PACS numbers: 03.67.Pp, 03.67.Hk, 03.67.Lx 



I. INTRODUCTION 

Quantum error correction (QEC) [1] and fault-tolerant
quantum computation [2] demonstrate that quantum information
can in principle be stored and manipulated coherently for
arbitrarily long times despite the presence of noise. The
general framework of QEC is the following. Redundancy is
introduced by encoding the information of system $S$ into a
larger system $S'$. The image of $S$ in $S'$ characterizes a
code, while a particular embedding of $S$ into $S'$ is called
an encoding. The system $S'$ is subjected to some noise.
Partial measurements, whose outcomes are known as the "error
syndrome", are performed on $S'$. Conditioned on this error
syndrome, a recovery operation is applied to $S'$ in order to
restore its original information. This last step, called
"decoding", is the subject of the present study.

In the absence of structure in the code, we know from
a classical result [3] that finding the optimal recovery
is NP-hard. For practical purposes, one must either
use codes with lots of structure (which typically offer
poorer performance) or settle for suboptimal recovery.
Residual errors after decoding are therefore of two varieties:
those due to the information-theoretic limitations
of the code and those arising from suboptimal decoding
procedures. In the past decades, considerable progress
has been made towards understanding this tradeoff in the
classical setting (see e.g. [4, 5] and references therein).
Central to these advancements is the use of the message
passing decoding algorithm pioneered by Gallager,
which often leads to near-optimal decoding. This technique
was recently introduced in the quantum realm by
Ollivier et al. [7, 8] for the decoding of low density parity
check (LDPC) codes (see also [9, 10] for related work).

Concatenation of block codes is widely used in quantum
information science and is a key component of almost
all fault-tolerant schemes (a notable exception
is topological quantum computing [11]). As the name
suggests, the system $S'$ that redundantly encodes the
information of system $S$ can itself be encoded in a yet
larger system $S''$, adding an extra layer of redundancy.
Provided the initial error rate is below a threshold value,
every extra level of concatenation should reduce the
probability of error after decoding, so concatenation can in
principle be repeated until the error is below any desired
value.

In this article, we demonstrate an efficient [23] message
passing algorithm that achieves optimal (maximum
likelihood) decoding for concatenated block codes with
uncorrelated noise. We numerically investigate the message
passing algorithm using the 5 qubit code [12] and Steane's
7 qubit code [13] and compare its performance to
the commonly used blockwise minimal-distance decoder
(based on a local rather than global optimization). The
advantages of the message passing algorithm are substantial.
On the one hand, for the 5 qubit code used on a
depolarizing channel, the message passing algorithm can
correctly decode the information for a noise level up to at
least 0.1885 (the exact threshold is probably the hashing
bound $\approx 0.189$) compared to the value 0.1376 previously
established using blockwise decoding [13]. For Steane's
code, this enhancement is even greater, going from 0.0969
[13] to at least 0.188. On the other hand, away from these
noise thresholds, the probability of error decreases at a
significantly higher rate using optimal decoding. For instance,
for a 0.1 depolarizing channel and using 4 levels
of concatenation of the 5 qubit code, the probabilities
that the blockwise decoding and the optimal decoding
fail to correctly identify the error differ by more than 3
orders of magnitude. As a consequence, a decoding error
probability $p_e \le \delta$ for any $\delta > 0$ can be achieved with a
substantially reduced error correction overhead.



II. STABILIZER FORMALISM 

Our presentation of the stabilizer formalism follows
p^; see [3] for the general theory. Denote by $X$, $Y$
and $Z$ the three Pauli matrices and by $\mathbb{1}$ the $2 \times 2$
identity matrix. The group $\mathcal{P}_1$ is the multiplicative group
generated by the Pauli matrices and the imaginary unit
$i$. The $n$-qubit Pauli group $\mathcal{P}_n$ is the $n$-fold tensor
product of $\mathcal{P}_1$. We denote $X_j$ the Pauli matrix $X$ acting on
the $j$th qubit for $j = 1, \ldots, n$ and similarly for $Y$ and $Z$.
Note that the $X_j$'s and the $Z_j$'s are a generating set of
$\mathcal{P}_n$, i.e. $\mathcal{P}_n = \langle i, X_j, Z_j \rangle$. The Clifford group on $n$ qubits
$\mathcal{C}_n$ is the largest subgroup of the unitary group $U(2^n)$
that maps $\mathcal{P}_n$ to itself under the adjoint action.

The encoding of $k$ qubits into $n$ qubits can be specified
by a matrix $C \in \mathcal{C}_n$. $C$ is a unitary matrix acting
on $n$ qubits that are distributed in 3 different sets.
The first $k$ "logical" qubits contain the information to be
encoded in the $n$ qubits; the next $u$ "stabilizer" qubits
are set to the state $|0\rangle^{\otimes u}$; and finally the remaining
$r = n - k - u$ "gauge" qubits are in arbitrary states. The
images of the Pauli operators acting on the first $k$ qubits
are known as logical Pauli operators $\overline{X}_j = C X_j C^\dagger$ and
$\overline{Z}_j = C Z_j C^\dagger$. The images of the $Z$ Pauli operators acting
on qubits $j = k+1, \ldots, k+u$ are called stabilizer
generators $S_j = C Z_{k+j} C^\dagger$, whereas the images of the $X$
operators acting on those qubits are called pure errors
$T_j = C X_{k+j} C^\dagger$. Finally, the images of the Pauli operators
acting on the remaining $r$ qubits are called gauge
operators $g_j^x = C X_{k+u+j} C^\dagger$ and $g_j^z = C Z_{k+u+j} C^\dagger$.

The stabilizer generators $S_j$ mutually commute, so they can
be simultaneously measured. The outcome of that measurement
is called the error syndrome $s \in \{-1, 1\}^u$. Since
the $u$ stabilizer qubits are all in state $|0\rangle$ prior to encoding,
we conclude that in the absence of noise the encoded
state should be a $+1$ eigenstate of all stabilizer
generators, thus the error syndrome should be all ones.
A non-trivial syndrome therefore indicates that an error
has corrupted the register, and the task of decoding consists
in finding the optimal recovery procedure given an
error syndrome.
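As a concrete illustration of the syndrome, the following Python sketch evaluates $s$ for the 5 qubit code. The generator list is the standard cyclic choice for that code, which is an assumption of this example (the text does not fix a generator set), and the helper names are hypothetical. Each syndrome bit is $+1$ when the error commutes with the corresponding generator and $-1$ when it anticommutes.

```python
# Stabilizer generators of the 5 qubit code (standard cyclic choice; an
# assumption of this sketch, not taken from the text).
STABILIZERS = ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]


def anticommute(p, q):
    """True if the Pauli strings p and q anticommute.

    Two single-qubit Paulis anticommute when both are non-identity and
    different; the strings anticommute when this happens an odd number of times.
    """
    clashes = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 1


def syndrome(error, stabilizers=STABILIZERS):
    """Syndrome in {-1, 1}^u: -1 where the error anticommutes with a generator."""
    return tuple(-1 if anticommute(error, s) else 1 for s in stabilizers)


print(syndrome("IIIII"))  # no error: all ones
print(syndrome("XIIII"))  # a single X error on the first qubit
```

Because this code is non-degenerate with distance 3, the 15 single-qubit errors all produce distinct non-trivial syndromes, which is what makes a lookup table decoder possible.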

III. DECODING 

To address the decoding problem, note that $\mathcal{P}_n =
\langle i, \overline{X}_j, \overline{Z}_j, S_j, T_j, g_j^x, g_j^z \rangle$. In other words, any element
$E \in \mathcal{P}_n$ can be written, up to an irrelevant phase, as

$$E = \mathcal{L}(E)\,\mathcal{T}(E)\,\mathcal{G}(E), \tag{1}$$

where $\mathcal{L}(E)$ is a product of logical Pauli operators, $\mathcal{T}(E)$
is a product of pure errors, and $\mathcal{G}(E)$ is a product of
gauge operators and stabilizer elements. Moreover, this
decomposition can be found by running the circuit $C$
backward, which is efficient since $C \in \mathcal{C}_n$ [3]. $\mathcal{T}(E)$ is
completely determined by the syndrome: $T_j$ appears in
$\mathcal{T}(E)$ if and only if the $j$th syndrome bit is $-1$. The value
of $\mathcal{G}(E)$ is irrelevant because the information encoded in
the $n$ qubits is invariant under the action of any $\mathcal{G}(E)$.
This reflects the fact that the stabilizer qubits are initially
set to $|0\rangle$ and that the gauge qubits are in random
states. Thus, to undo the effect of an error $E$, one needs
to identify the most likely value of $L = \mathcal{L}(E)$ given $s$, or
equivalently given $T = \mathcal{T}(E)$.

For simplicity, we will focus on Pauli channels where errors
$E$ are elements of $\mathcal{P}_n$ distributed according to $P(E)$.
Given this probability $P(E)$ over $\mathcal{P}_n$ one can compute the
conditional probability $P(L|T) = P(L,T)/P(T)$ using

\begin{align}
P(L,T) &= \sum_E \delta[\mathcal{T}(E) = T]\,\delta[\mathcal{L}(E) = L]\,P(E) \tag{2}\\
       &= \sum_G P(E = LTG), \tag{3}
\end{align}

where $\delta$ denotes the indicator function and $G$ takes all
possible combinations of stabilizer generators and gauge
operators. Given a finite block size $n$, these probabilities
can be computed and the optimal decoding $L(T) =
\mathrm{argmax}_L\, P(L|T)$ can be evaluated. Decoding a block
code thus consists of looking in a table containing the
values of $L(T)$ for each $T$. Typically (and in particular
for a non-degenerate code over the depolarization
channel) $L(T)$ corresponds to the minimal distance
decoder $\mathcal{L}(E(T))$, where $E(T)$ is the error acting on the
fewest qubits that is compatible with the
observed syndrome.
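The table construction of Eqs. (2) and (3) can be sketched by brute force for a single block. The Python example below is a minimal illustration under stated assumptions: it uses the 5 qubit code with its standard generators and logical operators $\overline{X} = XXXXX$, $\overline{Z} = ZZZZZ$ (neither is fixed by the text), it indexes the table by the syndrome $s$ rather than by $T$ (the two are in one-to-one correspondence), and it labels the logical class of an error by its commutation with the logical operators, which corresponds to pure errors that commute with them. All function names are hypothetical.

```python
from itertools import product

# Assumed generators and logical operators of the 5 qubit code.
STABILIZERS = ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]
LOGICAL_X, LOGICAL_Z = "XXXXX", "ZZZZZ"


def anticommute(p, q):
    """True if the Pauli strings p and q anticommute."""
    clashes = sum(1 for a, b in zip(p, q) if "I" not in (a, b) and a != b)
    return clashes % 2 == 1


def syndrome(e):
    return tuple(-1 if anticommute(e, s) else 1 for s in STABILIZERS)


def logical_class(e):
    """Logical label of e, relative to pure errors commuting with the logicals."""
    has_x = anticommute(e, LOGICAL_Z)  # an X component flips the sign of Z-bar
    has_z = anticommute(e, LOGICAL_X)  # a Z component flips the sign of X-bar
    return "IXZY"[has_x + 2 * has_z]


def joint_table(p):
    """P(L, s) as in Eq. (2), summing P(E) over all 4**5 Pauli strings."""
    single = {"I": 1 - p, "X": p / 3, "Y": p / 3, "Z": p / 3}
    joint = {}
    for e in product("IXYZ", repeat=5):
        prob = 1.0
        for a in e:
            prob *= single[a]
        key = (syndrome(e), logical_class(e))
        joint[key] = joint.get(key, 0.0) + prob
    return joint


def lookup_table(p):
    """Most likely logical operator for each syndrome value."""
    best = {}
    for (s, label), prob in joint_table(p).items():
        if s not in best or prob > best[s][1]:
            best[s] = (label, prob)
    return {s: label for s, (label, _) in best.items()}


def failure_probability(p):
    """Probability that the table entry differs from the true logical class."""
    correct = {}
    for (s, _), prob in joint_table(p).items():
        correct[s] = max(correct.get(s, 0.0), prob)
    return 1 - sum(correct.values())


table = lookup_table(0.1)
print(len(table))                 # one entry per syndrome value
print(failure_probability(0.1))   # single-block failure probability at p = 0.1
```

The enumeration costs $4^n$ operations, which is fine for $n = 5$ or $7$ but is exactly what becomes infeasible once blocks are concatenated.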

Concatenation is realized by encoding the $n$ qubits of
the code in another code. There is no need for this other
code to be identical to the original one. However, to simplify
the presentation, we will assume that the same code
is used at every concatenation layer and that it encodes
a single qubit in $n$ qubits; generalizations are straightforward.
This procedure can be repeated $\ell$ times at the
expense of an exponentially growing number of physical
qubits $n^\ell$. The number of stabilizer generators grows
roughly as $u n^{\ell-1}$ (it is a geometric sum); thus the syndrome
can take $2^{u n^{\ell-1}}$ different values. Hence, even for
moderate values of $\ell$, it is not feasible to construct a
lookup table giving the optimal decoding procedure for
each syndrome value.
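To make the counting explicit, the geometric sum can be written out for a code with $k = 1$, assuming that layer $m$ contains $n^{m-1}$ blocks and that each block contributes $u$ generators. The numerical instance uses the 5 qubit code ($n = 5$, $u = 4$) with $\ell = 3$, chosen purely for illustration.

```latex
\begin{align*}
\#\,\text{generators} &= \sum_{m=1}^{\ell} u\, n^{m-1} = u\,\frac{n^{\ell}-1}{n-1}, \\
n = 5,\ u = 4,\ \ell = 3: \quad \#\,\text{generators} &= 4\,(1 + 5 + 25) = 124, \\
\#\,\text{syndromes} &= 2^{124} \approx 2.1 \times 10^{37}.
\end{align*}
```

The last layer alone contributes $u n^{\ell-1} = 100$ of the 124 generators, which is why the total grows roughly as $u n^{\ell-1}$.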

What is generally done to circumvent this double exponential
blowup is to apply the optimal recovery independently
for each concatenation layer (see e.g. [19,
Chap. 6] and references therein). One first measures
the syndrome from each of the $n^{\ell-1}$ blocks of $n$ qubits
of the last layer of concatenation, and optimally decodes
them using the lookup table. One then moves one layer
up and applies the same procedure to the $n^{\ell-2}$ blocks
of the second-to-last layer, etc. When the initial error
rate is below a certain threshold value, the probability
$p_e$ that this procedure fails to correctly identify $\mathcal{L}(E)$
decreases doubly-exponentially with $\ell$. Hence, this decoding
scheme based on hard decisions for each concatenation
layer is efficient and leads to good error suppression,
but is nonetheless suboptimal.
A. Optimal decoding



[Figure: tree-structured factor graph of the concatenated code.]

Let $s_m^{(j)} \in \{-1,1\}^u$ be the syndrome of the $j$th block
of the $m$th concatenation layer. Denote $\mathbf{s}_m^{(j)}$ the collection
of syndromes whose stabilizers act non-trivially on
the physical qubits associated to the $j$th block of the $m$th
concatenation layer: these sets can be defined recursively
by $\mathbf{s}_m^{(j)} = \{s_m^{(j)}\} \cup \{\cup_{j'=n(j-1)+1}^{nj} \mathbf{s}_{m+1}^{(j')}\}$ with the initialization
$\mathbf{s}_\ell^{(j)} = s_\ell^{(j)}$. Finally, denote $\mathbf{s}_m = \cup_{j=1}^{n^{m-1}} \mathbf{s}_m^{(j)}$ all the
syndromes from the layers $m$ to $\ell$. (See the above figure
for a pictorial representation of $s_m^{(j)}$ and $\mathbf{s}_m^{(j)}$.) Then,
$\mathbf{s}_1$ is the set of all syndromes and maximum likelihood decoding
consists in finding $\mathrm{argmax}_{L_1} P(L_1|\mathbf{s}_1)$. This probability
can be factorized by conditioning on the logical
errors of the second layer $L_2 = (L_2^{(1)}, \ldots, L_2^{(n)})$:

\begin{align}
P(L_1|\mathbf{s}_1) &= \sum_{L_2} P(L_1|\mathbf{s}_1, L_2)\, P(L_2|\mathbf{s}_1) \nonumber\\
&= \sum_{L_2} \delta[L_1 = \mathcal{L}(L_2)]\, \frac{P(s_1|L_2,\mathbf{s}_2)\, P(L_2,\mathbf{s}_2)}{P(s_1,\mathbf{s}_2)} \nonumber\\
&= \sum_{L_2} \delta[L_1 = \mathcal{L}(L_2)]\, \delta[s_1 = \mathcal{S}(L_2)]\, \frac{P(L_2,\mathbf{s}_2)}{P(s_1,\mathbf{s}_2)} \nonumber\\
&= \sum_{L_2} \frac{\delta[L_1 = \mathcal{L}(L_2)]\, \delta[s_1 = \mathcal{S}(L_2)]}{P(s_1|\mathbf{s}_2)} \prod_{j=1}^{n} P(L_2^{(j)}|\mathbf{s}_2^{(j)}). \tag{4}
\end{align}

Above, $s_1 \equiv s_1^{(1)}$ is the syndrome of the single block of the
first layer and $\mathcal{S}(L)$ denotes the syndrome associated to the error
pattern $L \in \mathcal{P}_n$. This series of manipulations repeatedly
uses Bayes' rule and the fact that the syndrome and
logical error of level $m$ are completely determined given
the logical errors of layer $m+1$. The last step relies on
the important assumption that the channel is memoryless,
or more specifically, that the noise model does not
correlate qubits across distinct blocks (errors on qubits
in the same block could be correlated).

Equation (4) shows that by conditioning on the logical
errors of each concatenation layer, the factor graph associated
to the function $P(L_1|\mathbf{s}_1)$ is a tree, as depicted in
the above graph. We have thus reduced optimal decoding
to a SUM PRODUCT problem (known as tensor network
contraction in quantum information science ^V^) on a
tree graph which can be solved exactly and efficiently in
the number of variables using a message passing algorithm
(also known as belief propagation); see 0,151, iisf
and references therein. Let us describe this algorithm in
a general setting.

The factor graph is a bipartite graph, and vertices from
the two partitions are decorated with circles and boxes.
Circle vertices are labeled $c = 1, \ldots, N$ and each one
carries a variable $x_c$ with value in a discrete set. Box
vertices are labeled $b = 1, \ldots, M$, and each one contains
a function $f_b$ that depends on the variables $x_c$ from the
adjacent circles $c \in \mathcal{N}(b)$, collectively denoted $x_{\mathcal{N}(b)} = \{x_c :
c \in \mathcal{N}(b)\}$. The goal is to compute marginals

$$f(x_c) = \frac{1}{Z} \sum_{\{x_1, \ldots, x_N\} \setminus x_c} \prod_{b=1}^{M} f_b(x_{\mathcal{N}(b)}), \tag{5}$$

where $\setminus x_c$ indicates that $x_c$ is omitted from the set and
$Z$ is a normalization factor. To this end, messages $q_{c \to b}$
are passed from the circles to the boxes and messages
$r_{b \to c}$ are passed from the boxes to the circles following
the rules

\begin{align}
q_{c \to b}(x_c) &= \prod_{b' \in \mathcal{N}(c) \setminus b} r_{b' \to c}(x_c), \tag{6}\\
r_{b \to c}(x_c) &= \sum_{x_{\mathcal{N}(b)} \setminus x_c} f_b(x_{\mathcal{N}(b)}) \prod_{c' \in \mathcal{N}(b) \setminus c} q_{c' \to b}(x_{c'}), \tag{7}
\end{align}

where $\mathcal{N}(b) \setminus c$ means all neighbors of $b$ excluding $c$, and
similarly for $\mathcal{N}(c) \setminus b$. Note that these messages are functions
of the discrete variables $x_c$ (i.e. they are vectors).
The $q_{c \to b}$ messages are initialized to the constant function
1. For a tree graph, the desired marginal is obtained from
these messages after a number of steps equal to the depth
of the variable $x_c$ and is given by $f(x_c) = \frac{1}{Z_c} \prod_{b \in \mathcal{N}(c)} r_{b \to c}(x_c)$,
where $Z_c$ is a normalization factor.

In the case of interest, circles carry logical operators
and a box labeled $m, j$ carries the function
$\delta[L_m^{(j)} = \mathcal{L}(L_{m+1}^{(n(j-1)+1)}, \ldots, L_{m+1}^{(nj)})]\,
\delta[s_m^{(j)} = \mathcal{S}(L_{m+1}^{(n(j-1)+1)}, \ldots, L_{m+1}^{(nj)})]$, where the syndrome is fixed by
the measurements. To complete the picture, extra box
vertices carrying the function Eq. (3) need to be attached
to the bottom leaves of the graph. The factor $P(s_1|\mathbf{s}_2)^{-1}$
can be evaluated by normalizing the obtained distribution.
Thus, we can efficiently evaluate $P(L_1|\mathbf{s}_1)$ [2^, and
the optimal recovery is the $L_1$ maximizing this function.
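The upward pass on the tree can be sketched in Python for the concatenated 5 qubit code. This is a minimal illustration under stated assumptions, not the implementation used for the numerical results: the generators and logical operators are the standard ones for this code, the physical noise is depolarizing with rate `p`, logical labels are defined by commutation with the logical operators (that is, relative to pure errors that commute with them), and all function names are hypothetical. Each call to `box_to_parent` evaluates the sum in Eq. (4) for one block, and its normalization plays the role of the factor $P(s_1|\mathbf{s}_2)$.

```python
from itertools import product

STABILIZERS = ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]  # assumed generator choice
LOGICAL_X, LOGICAL_Z = "XXXXX", "ZZZZZ"              # assumed logical operators
PAULIS = "IXYZ"
N = 5


def anticommute(p, q):
    clashes = sum(1 for a, b in zip(p, q) if "I" not in (a, b) and a != b)
    return clashes % 2 == 1


def syndrome(e):
    return tuple(-1 if anticommute(e, s) else 1 for s in STABILIZERS)


def logical_class(e):
    return "IXZY"[anticommute(e, LOGICAL_Z) + 2 * anticommute(e, LOGICAL_X)]


# Group the 4**5 Pauli strings by syndrome once: (string, logical label) pairs.
BY_SYNDROME = {}
for e in product(PAULIS, repeat=N):
    BY_SYNDROME.setdefault(syndrome(e), []).append((e, logical_class(e)))


def box_to_parent(child_msgs, s):
    """Message from a box to its parent circle: P(L | syndromes at and below)."""
    out = dict.fromkeys(PAULIS, 0.0)
    for e, label in BY_SYNDROME[s]:      # enforces the syndrome delta function
        weight = 1.0
        for msg, a in zip(child_msgs, e):
            weight *= msg[a]             # product of the children's messages
        out[label] += weight             # enforces the logical delta function
    total = sum(out.values())            # normalization of the distribution
    return {label: w / total for label, w in out.items()}


def decode(syndromes, p, layer=0, block=0):
    """syndromes[m][j]: syndrome of block j in layer m (0-based, top layer first)."""
    if layer == len(syndromes):          # below the last layer: physical prior
        return {"I": 1 - p, "X": p / 3, "Y": p / 3, "Z": p / 3}
    children = [decode(syndromes, p, layer + 1, N * block + c) for c in range(N)]
    return box_to_parent(children, syndromes[layer][block])


trivial, flagged = (1, 1, 1, 1), (1, 1, 1, -1)
# Two layers; the syndromes are those produced by one X error on the first qubit.
posterior = decode([[flagged], [flagged] + [trivial] * 4], 0.05)
print(max(posterior, key=posterior.get), posterior)
```

The cost per block is fixed, so the total work is proportional to the number of blocks, in contrast to a lookup table over all syndromes. A blockwise hard decoder would instead replace each child message by a point mass on its most likely value before passing it up.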

The advantage of the message passing algorithm over
the minimal distance decoder comes from the fact that
it does not throw away useful information [19]. Instead
of computing the most likely recovery and passing it on
to the next level of coding, the entire list of probabilities
of the possible recoveries, conditioned on the observed
syndrome, is passed on. In other words, the original channel
is composed with the syndrome measurement, and projected
onto the logical algebra to yield a "conditionally
renormalized" channel.



IV. NUMERICAL RESULTS 

Following the tradition for benchmarking QEC techniques,
we investigate the performance of the message
passing decoding algorithm using a depolarization channel,
where each qubit is independently subjected to the
channel

$$\mathcal{E}_p(\rho) = (1-p)\,\rho + \frac{p}{3}\,(X\rho X + Y\rho Y + Z\rho Z). \tag{8}$$

We use the 5 qubit code [12] concatenated with itself
up to $\ell = 10$ times, for an overhead of 9,765,625 physical
qubits per logical qubit. Pauli errors $E \in \mathcal{P}_{n^\ell}$ are
generated by picking each of the $n^\ell$ single-qubit operators
independently according to the probability $P(\mathbb{1}) = 1 - p$,
$P(X) = P(Y) = P(Z) = p/3$. The associated logical
error $\mathcal{L}(E)$ and syndromes $\mathcal{S}(E)$ are computed exactly.
These syndromes are used by a blockwise decoding routine
yielding an estimate $L_{\rm bw}$ and by a message passing
routine yielding the optimal decoding $L$. A decoding is



declared incorrect when its estimate differs from $\mathcal{L}(E)$.
This is repeated a large number of times (10"* — 10*) to 
evaluate the probability $p_e$ that the decoding gives an
incorrect estimate.
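The sampling procedure just described can be sketched for a single level of the 5 qubit code, where the lookup table decoder is already optimal. The Python example below is a simplified illustration and not the code used to produce the figures: the generators, logical operators, function names, sample size and seed are all assumptions of the sketch, and a full experiment would repeat this over $\ell$ layers with both decoders.

```python
import random
from itertools import product

STABILIZERS = ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]  # assumed generator choice
LOGICAL_X, LOGICAL_Z = "XXXXX", "ZZZZZ"              # assumed logical operators
PAULIS = "IXYZ"


def anticommute(p, q):
    clashes = sum(1 for a, b in zip(p, q) if "I" not in (a, b) and a != b)
    return clashes % 2 == 1


def syndrome(e):
    return tuple(-1 if anticommute(e, s) else 1 for s in STABILIZERS)


def logical_class(e):
    return "IXZY"[anticommute(e, LOGICAL_Z) + 2 * anticommute(e, LOGICAL_X)]


def ml_table(p):
    """Most likely logical class for each syndrome, by brute-force enumeration."""
    single = {"I": 1 - p, "X": p / 3, "Y": p / 3, "Z": p / 3}
    joint = {}
    for e in product(PAULIS, repeat=5):
        prob = 1.0
        for a in e:
            prob *= single[a]
        dist = joint.setdefault(syndrome(e), dict.fromkeys(PAULIS, 0.0))
        dist[logical_class(e)] += prob
    return {s: max(dist, key=dist.get) for s, dist in joint.items()}


def sample_error(p, n, rng):
    """Draw n single-qubit operators with P(I) = 1 - p, P(X) = P(Y) = P(Z) = p / 3."""
    return "".join(rng.choices(PAULIS, weights=[1 - p, p / 3, p / 3, p / 3], k=n))


def estimate_pe(p, samples, seed=0):
    """Fraction of sampled errors whose logical class is misidentified."""
    rng = random.Random(seed)
    table = ml_table(p)
    failures = 0
    for _ in range(samples):
        e = sample_error(p, 5, rng)
        if table[syndrome(e)] != logical_class(e):
            failures += 1
    return failures / samples


print(estimate_pe(0.1, 20000))
```

The statistical uncertainty of such an estimate scales as $\sqrt{p_e(1-p_e)/\text{samples}}$, which is why very small failure probabilities require correspondingly large sample sizes.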




[Plot for FIG. 1; horizontal axis: Concatenation.]



FIG. 1: Monte Carlo results for the 5 qubit code showing
the probability of erroneous decoding $p_e$ as a function of the
level of concatenation $\ell$ for different depolarization rates $p$ =
0.13, 0.15, 0.17, 0.18, 0.1885, and 0.19. The diamonds are
from the message passing algorithm, and the circles are from
the blockwise decoding. All data are from samples of 2 ■ 10* 
encoded qubits. 



Figure 1 shows the probability of incorrect decoding
$p_e$ for both the blockwise and the optimal decoding as a
function of the level of concatenation $\ell$ and for different
channel parameters $p$ ranging from 0.13 to 0.19. For the
blockwise decoding, $p_e$ ceases to decrease with $\ell$ for values
of $p \geq 0.15$. This reflects the fact that the threshold of
this decoding technique for this particular code is about
0.1376 [14], so all curves except the 0.13 one are above the
threshold. On the other hand, optimal decoding succeeds
in decreasing the error probability for values of $p$ up to
at least 0.1885, but appears to fail at $p \approx 0.19$. We
conjecture that the exact value of this threshold is the
hashing bound $\approx 0.189$, where the single-qubit coherent
information vanishes, which is the highest threshold any
non-degenerate code can achieve [23]. Results obtained from
Steane's code [13] show a quite similar behavior, with at
least a 94% increase of the threshold, going from 0.0969 [20] to
at least 0.188; decoding appears to fail at 0.1885.

An interesting feature of the $p_e(\ell)$ curves obtained from
optimal decoding is their non-monotonicity. Blockwise
decoding, on the other hand, always yields monotonic
curves for this type of channel; thus its global behavior
under concatenation can be predicted from a single level
of coding. This is because decoding is performed independently
on each concatenation layer. With the optimal
decoder, information about the syndromes is propagated
from one layer of concatenation to the next through the
conditionally-renormalized channel that ceases to be
depolarizing and varies from one qubit to the other. Thus,
non-monotonicity of the $p_e(\ell)$ curves is a signature of the
global optimization performed by the message passing
algorithm.

[Plot for FIG. 2; horizontal axis: Concatenation.]

FIG. 2: As in Fig. 1 for $p$ = 0.1 and 0.05. Diamonds obtained
from samples of 10* encoded qubits. Circles were produced
using an exact numerical technique similar to that of Ref. [ij .

Figure 2 shows the behavior of $p_e$ as a function of $\ell$
away from the threshold values, i.e. in the natural operating
regime of the code. Again, the advantages of
the message passing algorithm are considerable. After 4
rounds of concatenation for $p = 0.1$, message passing fails
with a probability of roughly 10~^, whereas this proba- 
bility is well above 10^"^ for blockwise hard decoding. It 
takes 6 layers of concatenation for the blockwise decod- 
ing to reach comparable performances. Again, results 
obtained from Steane's code show an even larger gap. 

Finally, we once again stress that the message passing
outputs the probability of an error $L$ rather than a particular
value of $L$. A hard decision can then be made based
on this probability. We observe that when decoding succeeds,
$P(L)$ is typically very close to one (e.g. 0.999 for
$\ell = 3$) whereas when it fails it is relatively low (typically
0.7); the algorithm knows that it is failing. This
"flagging" of errors offers a great advantage when
post-selection is an option. The possibility of operating the
algorithm with soft inputs, i.e. noisy syndrome measurements,
is also of interest in several circumstances.



V. CONCLUSION 

We have demonstrated an efficient message passing algorithm
for the optimal decoding of concatenated quantum
block codes on memoryless channels. Numerical
results show substantial benefits of our approach over
the widely used blockwise hard decoding, including an
increase of error thresholds and a greater error suppression
rate. Message passing algorithms have been used on
graphs with loops (describing e.g. LDPC codes, turbo-codes,
or channels with memory) and often yield near-optimal
decoding. The quantum generalizations of these
schemes, including quantum LDPC codes [7, 8] and quantum
turbo-codes [22], are promising avenues for the realization
of quantum information technologies. Techniques
reminiscent of message passing have been used to
beat the hashing bound but were not efficiently implementable
[20, 21]: efficient decoding may now be within
reach using our techniques. A "hard" message passing
scheme was also used in [21] to obtain high fault-tolerant
error thresholds: a full-fledged message passing scheme
(although not optimal for the correlated errors that are
typically present in fault-tolerant schemes) should further
improve this threshold and may significantly reduce the
resource overhead.